Abstract

Background: The quantification of experimentally-induced alterations in biological pathways remains a major 
challenge in systems biology. One example of this is the quantitative characterization of alterations in defined, 
established metabolic pathways from complex metabolomic data. At present, the disruption of a given metabolic 
pathway is inferred from metabolomic data by observing an alteration in the level of one or more individual 
metabolites present within that pathway. Not only is this approach open to subjectivity, as metabolites participate 
in multiple pathways, but it also ignores useful information available through the pairwise correlations between 
metabolites. This extra information may be incorporated using a higher-level approach that looks for alterations 
between a pair of correlation networks. In this way experimentally-induced alterations in metabolic pathways can 
be quantitatively defined by characterizing group differences in metabolite clustering. Taking this approach 
increases the objectivity of interpreting alterations in metabolic pathways from metabolomic data. 

Results: We present and justify a new technique for comparing pairs of networks; in our case these networks are 
based on the same set of nodes and there are two distinct types of weighted edges. The algorithm is based on 
the Generalized Singular Value Decomposition (GSVD), which may be regarded as an extension of Principal 
Component Analysis to the case of two data sets. We show how the GSVD can be interpreted as a technique for 
reordering the two networks in order to reveal clusters that are exclusive to only one. Here we apply this 
algorithm to a new set of metabolomic data from the prefrontal cortex (PFC) of a translational model relevant to 
schizophrenia, rats treated subchronically with the N-methyl-D-Aspartic acid (NMDA) receptor antagonist 
phencyclidine (PCP). This provides us with a means to quantify which predefined metabolic pathways (Kyoto 
Encyclopedia of Genes and Genomes (KEGG) metabolite pathway database) were altered in the PFC of PCP-treated 
rats. Several significant changes were discovered, notably: 1) neuroactive ligands active at glutamate and GABA 
receptors are disrupted in the PFC of PCP-treated animals; 2) glutamate dysfunction in these animals is not 
limited to compromised glutamatergic neurotransmission but also involves the disruption of metabolic pathways 
linked to glutamate; and 3) a specific series of purine reactions, xanthine ← hypoxanthine ↔ inosine ← IMP → 
adenylosuccinate, is also disrupted in the PFC of PCP-treated animals. 

Conclusions: Network reordering via the GSVD provides a means to discover statistically validated differences in 
clustering between a pair of networks. In practice this analytical approach, when applied to metabolomic data, 
allows us to quantify the alterations in metabolic pathways between two experimental groups. With this new 
computational technique we identified metabolic pathway alterations that are consistent with known results. 
Furthermore, we discovered disruption in a novel series of purine reactions that may contribute to the PFC 
dysfunction and cognitive deficits seen in schizophrenia. 
Background

Background in neuroscience and metabolomics 

Schizophrenia is characterized by deficits in cognition 
known to be dependent upon the functional integrity of 
the prefrontal cortex (PFC). Furthermore, compromised 
PFC function in schizophrenia is supported by a multitude 
of neuroimaging studies reporting hypometabolism ('hypo- 
frontality'), as evidenced by decreased blood flow or glu- 
cose utilization [1,2]. While the pathophysiological basis of 
PFC dysfunction in schizophrenia is not completely under- 
stood, a central role for NMDA receptor hypofunction is 
widely supported. For example, subchronic exposure to the 
NMDA receptor antagonist phencyclidine (PCP) induces 
cognitive deficits and a 'hypofrontality' which directly parallels
that seen in schizophrenia [3-5]. Furthermore, subchronic
PCP exposure induces alterations in GABAergic
cell markers and 5-HT receptor expression in the PFC 
similar to those seen in this disorder [3,6,7]. While this evi- 
dence places NMDA receptor hypofunction central to the 
pathophysiology of PFC dysfunction in schizophrenia, the 
mechanisms through which NMDA hypofunction pro- 
motes PFC dysfunction are poorly understood. 

Metabolomics is the comprehensive analysis of small 
molecule metabolites in biological systems [8]. It involves 
the study of the metabolome which is defined as all of 
the small molecular weight compounds within a sample 
that are required for metabolism, whose roles include 
growth and functionality [9-11]. Sample sources include 
bacteria, parasites, animals and humans and sample types 
can include biofluids, cells or tissue extracts. Metabolo- 
mics can be utilized as a tool for the characterization and 
quantification of all of the metabolites in a biological sys- 
tem. Its applications include profiling disease biomarkers 
[12,13], monitoring disease progression [14], investigating 
xenobiotic metabolism [15], investigating drug-induced 
toxicity [16,17] and investigating metabolism in geneti- 
cally modified animals [18]. Mass spectrometry (MS) has 
been employed extensively as an analytical platform for 
metabolomics studies [19-21]. The popularity of this 
approach has increased over the last decade, in part due 
to the advent of high resolution Fourier transform mass 
spectrometers which offer improved reproducibility, 
accuracy and sensitivity. This makes mass spectrometry 
suitable for high throughput metabolomics studies [22]. 
In addition, the Orbitrap mass spectrometers that are 
now available offer similar performance to FT-MS sys- 
tems without the need for a high strength magnetic field 
[23]. HILIC chromatography has been utilized as a 
separation technique prior to MS detection of polar 
metabolites in aqueous biofluids such as urine, serum 
and plasma [24-30]. 

Additionally, it has been used for the detection of
multiple neurotransmitters in primate cerebral cortex
[31]. HILIC chromatography has been chosen for metabolomic
studies as it is useful for the analysis of highly
polar metabolites which are poorly retained on reverse 
phase columns [32]. Detailed reviews of the principles 
and applications of HILIC have been previously outlined 
[25,33]. Here, HILIC-chromatography is utilized in com- 
bination with an LTQ-Orbitrap for metabolic profiling 
of metabolite extracts from the PFC of control and 
PCP-treated rats. 

Metabolomics represents a robust approach through 
which alterations in diverse metabolic pathways may be 
determined at a biological systems level. In this way a 
metabolomics approach may prove useful in further elu- 
cidating the pathophysiological mechanisms contributing 
to PFC dysfunction in schizophrenia. Furthermore, this 
approach may also allow for the identification of PFC 
metabolic biomarkers for the cognitive deficits in this 
disorder. While the metabolomics approach can provide 
a rich and comprehensive set of data, the appropriate 
quantitative analysis of this data has not been adequately 
developed. In particular, the identification of statistical 
differences in metabolic pathways between experimental 
groups rather than the identification of statistical differ- 
ences in individual metabolites alone represents a major 
challenge to quantitatively identifying metabolic altera- 
tions at a systems level from metabolomic data. One 
method through which statistical differences in meta- 
bolic pathways can be identified from metabolomic data 
involves the representation of this data as a large, com- 
plex network of nodes (single metabolites) connected by 
real-value edges (the correlation coefficient between two 
metabolites). This form of representation has high face 
validity as the relationship between two metabolites, in a 
given pathway, is governed by a single or series of enzy- 
matic reactions that can be viewed as being represented 
by the correlation between the concentrations of the 
two metabolites. Another advantage is that metabolomic 
data consist of a range of metabolites detected in both 
of the experimental groups of interest meaning that 
these data can be expressed as two complex networks 
based upon the same set of nodes. This data structure is 
amenable to analysis through the application of the 
Generalized Singular Value Decomposition (GSVD) 
algorithm. 
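The network representation described above can be sketched in a few lines. This is a minimal illustration only, assuming plain Pearson correlation (not the partial correlation used later in the paper) and using small hypothetical intensity values; the function names `pearson` and `correlation_network` are introduced here for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_network(intensities):
    """Weighted network: node m is a metabolite, edge weight is a correlation.

    intensities[m] holds the levels of metabolite m across the animals
    of one experimental group.
    """
    n = len(intensities)
    return [[1.0 if i == j else pearson(intensities[i], intensities[j])
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: 3 metabolites measured in 4 animals per group.
control = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.1, 5.9, 8.0], [4.0, 1.0, 3.0, 2.0]]
treated = [[1.0, 2.0, 3.0, 4.0], [8.0, 1.0, 6.0, 2.0], [4.0, 1.0, 3.0, 2.0]]

A = correlation_network(control)   # network for group 1
B = correlation_network(treated)   # network for group 2, same nodes
```

In this toy example metabolites 0 and 1 are strongly correlated in the first group but not in the second, which is the kind of group difference the network comparison is designed to detect.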

Background in network science and spectral methods 

Large, complex interaction networks arise across many 
applications in science and technology [34-36]. Spectral 
methods, based on information computed from eigen- 
vectors or singular vectors, have been used successfully 
to reveal fundamental network properties. For example, 
we may wish to cluster objects into groups [37], put 
objects into order [38] or discover specific patterns of 
connectivity within subgroups [39-42]. In this work, we
look at the case where two interaction data sets are
available and the aim is to discover differences between the
two sets in the form of mutually exclusive clusters. For 
example, a given group of biologically defined entities, 
such as genes, proteins, metabolites or brain regions, 
may contain a subgroup that behaves in a coordinated 
manner under one condition, or in one organism, but 
not in another; that is, the network with respect to one type of
interaction contains a cluster that is not present in the
other. We will show that the Generalized Singular Value
Decomposition, which is becoming more widely used in
computational biology [43,44], can be justified as the
basis of a network reordering approach. We also con- 
sider how to quantify the statistical significance of net- 
work patterns that are uncovered. 

Overall, this work develops and applies a novel algo- 
rithm in network science and shows that it reveals 
meaningful insights when applied to cutting-edge meta- 
bolomic data. 

Results 

Derivation of new algorithm 

Suppose that the square, symmetric, real-valued
matrices $A$ and $B$ in $\mathbb{R}^{N \times N}$ represent two different types
of interaction between a set of N nodes. We have in 
mind the case where the weights play the role of corre- 
lation coefficients. Our aim is to discover clusters, in the 
sense of subsets of nodes that are mutually, pairwise, 
strongly connected through positive weights. The algo- 
rithm will also discover clusters of strong negative con- 
nectivity, although in practice this type of pattern is less 
likely to be present. However, we note that the argu- 
ments given below and the resulting algorithm remain 
valid in the case where the weights are non-negative, 
with zero representing the minimal level of similarity. 
The novelty of our approach is that in order to reveal 
interesting differences between the two types of connec- 
tivity data, we look for a set of nodes that form a good 
cluster with respect to A and a poor cluster with respect 
to B, or vice versa. As a starting point for a computa- 
tional algorithm, we consider the identity 



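$$\|Ax\|_2^2 = \sum_{i=1}^{N} \sum_{k=1}^{N} a_{ik}^2 x_k^2 + \sum_{k=1}^{N} \sum_{l=1,\, l \neq k}^{N} \sum_{i=1}^{N} a_{ik} a_{il} x_k x_l \tag{1}$$

for $x \in \mathbb{R}^N$. Here $\|\cdot\|_2$ denotes the Euclidean norm. Writing $\deg_i := \sum_{j=1}^{N} a_{ij}^2$, the first term on the right-hand side equals $\sum_{i=1}^{N} \deg_i$ whenever every $x_k = \pm 1$; the quantity $\deg_i$ is one way to generalize the concept of out-degree to the case of a weighted network.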
Suppose we wish to split the nodes into two groups 
such that nodes within each group are well-connected 
but nodes across different groups are poorly connected. 
We could use an indicator vector $x \in \mathbb{R}^N$ to denote such a partition, with $x_s = 1$ if node $s$ is placed in group 1 and $x_s = -1$ if node $s$ is placed in group 2.

Fixing on two nodes, $k$ and $l$, we could argue that the existence of a third node, $i$, such that $a_{ik}$ and $a_{il}$ are both large and positive or both large and negative is evidence in favor of placing $k$ and $l$ in the same group (since they have in common a strong similarity or dissimilarity with node $i$). On the other hand, small or oppositely signed values for $a_{ik}$ and $a_{il}$ are evidence in favor of placing $k$ and $l$ in different groups. In terms of the indicator vector, this translates to

1. $a_{ik} a_{il}$ large and positive $\Rightarrow$ try to choose $x_k x_l = +1$,

2. $a_{ik} a_{il}$ small or negative $\Rightarrow$ try to choose $x_k x_l = -1$.

Returning to the right-hand side of (1), we see that the first term, which equals $\sum_{i=1}^{N} \deg_i$ when every $x_k = \pm 1$, is independent of the choice of indicator vector, and $\sum_{k=1}^{N} \sum_{l=1,\, l \neq k}^{N} \sum_{i=1}^{N} a_{ik} a_{il} x_k x_l$ gives a measure of how successfully we have incorporated the (possibly conflicting) desiderata in points 1 and 2 over all pairs $k$, $l$ and third parties $i$. So we could judge the quality of an indicator vector by its ability to produce a large value of $\|Ax\|_2^2$, provided other constraints, such as balanced group sizes, were satisfied.
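The splitting of $\|Ax\|_2^2$ into a term that does not depend on the indicator vector and a cross term that rewards good partitions can be checked numerically. The sketch below is illustrative only: the 4-node weight matrix and the helper name `norm_terms` are hypothetical and not taken from the paper.

```python
def norm_terms(A, x):
    """Return (||Ax||^2, squared-weight term, cross term) for weight matrix A."""
    n = len(A)
    norm_sq = sum(sum(A[i][k] * x[k] for k in range(n)) ** 2 for i in range(n))
    # Term that reduces to the sum of generalized degrees when x_k = +/-1.
    square_term = sum((A[i][k] * x[k]) ** 2 for i in range(n) for k in range(n))
    # Cross term: sum over pairs k != l and third parties i.
    cross_term = sum(A[i][k] * A[i][l] * x[k] * x[l]
                     for i in range(n) for k in range(n)
                     for l in range(n) if l != k)
    return norm_sq, square_term, cross_term

# Hypothetical 4-node network with clusters {0, 1} and {2, 3}.
A = [[1.0, 0.9, 0.1, 0.0],
     [0.9, 1.0, 0.0, 0.1],
     [0.1, 0.0, 1.0, 0.8],
     [0.0, 0.1, 0.8, 1.0]]

good = norm_terms(A, [1, 1, -1, -1])   # partition matching the clusters
bad = norm_terms(A, [1, -1, 1, -1])    # partition cutting across the clusters
```

Both partitions share the same squared-weight term, so the difference in $\|Ax\|_2^2$ comes entirely from the cross term, which is larger for the partition that respects the clusters.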
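Analogously, we can argue that making $\sum_{k=1}^{N} \sum_{l=1,\, l \neq k}^{N} \sum_{i=1}^{N} b_{ik} b_{il} x_k x_l$, where $b_{ik}$ denotes the entries of $B$, as negative as possible is a good way to avoid forming well-connected subgroups, and so the problem

$$\max_{x \in \mathbb{R}^N,\; x_s \in \{-1, 1\},\; 1 \le s \le N} \frac{\|Ax\|_2^2}{\|Bx\|_2^2} \tag{2}$$

is a good basis for picking out strong clusters in $A$ that are not present in $B$.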

In general, optimizing over a large, discrete set of pos- 
sibilities is computationally intractable, and hence we 
will follow the widely used practice of relaxing to an 
optimization over $\mathbb{R}^N$ [37,45]. This approach goes back
as far as the pioneering work of Fiedler [46] and has
some theoretical underpinning in the case where a single
network is analyzed [47,48]. So, instead of (2) we
have

$$\max_{x \in \mathbb{R}^N,\; x \neq 0} \frac{\|Ax\|_2^2}{\|Bx\|_2^2}. \tag{3}$$



At this stage we recall that a general pair of matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times n}$ can be simultaneously factorized using the Generalized Singular Value Decomposition (GSVD) into

$$A = U C X^{-1} \quad \text{and} \quad B = V S X^{-1}, \tag{4}$$

where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{p \times p}$ are both orthogonal, $X \in \mathbb{R}^{n \times n}$ is invertible, $C \in \mathbb{R}^{m \times n}$ and $S \in \mathbb{R}^{p \times n}$ are diagonal with nonnegative entries such that $C = \mathrm{diag}(c_1, c_2, \ldots, c_n)$ and $S = \mathrm{diag}(s_1, s_2, \ldots, s_q)$ with $q = \min(p, n)$, and $0 \le c_1 \le c_2 \le \cdots \le c_n$ and $s_1 \ge s_2 \ge \cdots \ge s_q \ge 0$ [49]. The ratios $\lambda_i = c_i / s_i$ are the generalized singular values of $A$ and $B$.

A key property of the GSVD is that the columns of $X$
are stationary points of the function $f : \mathbb{R}^n \to \mathbb{R}$ given by
$f(x) = \|Ax\|_2 / \|Bx\|_2$, with the generalized singular
values $\lambda_i$ giving the corresponding stationary values.
Hence, we may tackle the problem (3) through the 
GSVD. Columns 1, 2, 3,... of X are candidates for finding 
good clusters in B that are poor clusters in A and, 
analogously, columns N, N - 1, N - 2,... of X are 
candidates for finding good clusters in A that are poor 
clusters in B. 
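The stationarity property can be verified directly from the factorization $A = U C X^{-1}$, $B = V S X^{-1}$. The following short derivation is a sketch under the assumption that $s_i \neq 0$; the unit vector $e_i$ and the use of $f(x)^2$ in place of $f(x)$ are notational choices made here for illustration.

```latex
\begin{align*}
f(x)^2 &= \frac{x^T A^T A x}{x^T B^T B x},
\qquad
\nabla \bigl( f(x)^2 \bigr) = 0
\iff A^T A x = f(x)^2 \, B^T B x, \\
A^T A \, (X e_i) &= X^{-T} C^T C \, X^{-1} X e_i = c_i^2 \, X^{-T} e_i, \\
B^T B \, (X e_i) &= X^{-T} S^T S \, X^{-1} X e_i = s_i^2 \, X^{-T} e_i, \\
\Longrightarrow \quad
A^T A \, (X e_i) &= \frac{c_i^2}{s_i^2} \, B^T B \, (X e_i)
= \lambda_i^2 \, B^T B \, (X e_i).
\end{align*}
```

So the $i$th column of $X$ satisfies the stationarity condition with $f(X e_i) = \lambda_i$. Because the $c_i$ increase and the $s_i$ decrease with $i$, the first columns give the smallest ratios and the last columns the largest.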

To transform back from real to discrete domains, we 
may use the ordering of the elements in x to define a 
new ordering for the two networks. More precisely, we 
relabel row and column $i$ of $A$ and $B$ as row and column
$p_i$, where

$$p_i \le p_j \Leftrightarrow x_i \le x_j.$$

In this way, the existence or lack of clusters in each 
network becomes apparent from inspection of the heat 
map of the reordered matrix. This is the approach that 
we use. We will also show that p-values can be computed
to quantify the statistical significance of the
results. The issue of fully automating the choice of clus- 
ter size is left as future work. 
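The relabeling step amounts to sorting the nodes by the entries of the chosen vector and permuting rows and columns symmetrically. A minimal sketch follows; the vector `x` and the small matrix are hypothetical values chosen for illustration, and the function names are not from the paper.

```python
def ordering_from_vector(x):
    """Node indices listed by increasing entry of x."""
    return sorted(range(len(x)), key=lambda i: x[i])

def positions_from_ordering(order):
    """p[i] is the new label (position) assigned to node i."""
    p = [0] * len(order)
    for new_label, node in enumerate(order):
        p[node] = new_label
    return p

def reorder(M, order):
    """Apply the same permutation to the rows and columns of M."""
    return [[M[i][j] for j in order] for i in order]

# Hypothetical column of the reordering matrix for a 4-node network.
x = [0.9, -0.5, 0.8, -0.4]
# Nodes 0 and 2 are connected; they are far apart in the original labeling.
A = [[0, 0, 1, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 0]]

order = ordering_from_vector(x)
p = positions_from_ordering(order)
A_reordered = reorder(A, order)
```

After reordering, the two connected nodes occupy adjacent positions at the end of the ordering, so the edge appears next to the diagonal of the reordered matrix, which is what makes clusters visible in a heat map.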

A variant of the algorithm 

In our context, the matrices A and B are square, with m 
= n = p = N. In this case, if we make the additional 
assumption that A and B are invertible it is known that 
the GSVD is closely related to the standard Singular 
Value Decompositions (SVD) of $AB^{-1}$ and $BA^{-1}$. To see
this, we could rearrange (4) into

$$AB^{-1} = U C S^{-1} V^T \quad \text{and} \quad BA^{-1} = V S C^{-1} U^T. \tag{5}$$



Alternatively, we may let z = Ax or y = Bx in (3), to 
obtain the quadratic problems 



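$$\max_{z \in \mathbb{R}^N,\; z \neq 0} \frac{\|z\|_2^2}{\|BA^{-1}z\|_2^2} \quad \text{or} \quad \max_{y \in \mathbb{R}^N,\; y \neq 0} \frac{\|AB^{-1}y\|_2^2}{\|y\|_2^2},$$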



which can be solved through the standard SVD. 

It is known from spectral graph theory that the domi- 
nant singular vectors give good directions in which to 
look for clusters [37,50]. Inverting the weight matrix 
reverses their importance (the singular value $\sigma$ becomes
$\sigma^{-1}$) and hence a spectral clustering approach applied
to $A^{-1}$ will typically find the opposite of good clusters:
poorly connected nodes will be grouped together [51].
So, intuitively, forming $AB^{-1}$ in (5) should produce a
data matrix for which the SVD approach finds good
clusters for $A$ and poor clusters for $B$. Analogously, the
opposite holds for $BA^{-1}$.

Having interpreted the algorithm this way, it is then 
natural to consider the reverse products, $A^{-1}B$ and $B^{-1}A$,
or, equivalently, to form the optimization problem

$$\max_{x \in \mathbb{R}^N,\; x \neq 0} \frac{\|B^{-1}x\|_2^2}{\|A^{-1}x\|_2^2}. \tag{6}$$

We may interpret (6) from the point of view that making $\|B^{-1}x\|_2$ large encourages poor clusters for $B$, while making $\|A^{-1}x\|_2$ small encourages good clusters for $A$. In this case, we would base our algorithm on the GSVD of $A^{-1}$ and $B^{-1}$.

In the situation where A and B are both symmetric, 
corresponding to undirected networks, we have, from 
(4), 

$$A^{-1} = (A^T)^{-1} = [X^{-T} C U^T]^{-1} = U C^{-1} X^T$$

and

$$B^{-1} = (B^T)^{-1} = [X^{-T} S V^T]^{-1} = V S^{-1} X^T.$$

Then we may appeal to the arguments given previously
and use columns from the inverse of the third
factor in the GSVD as the basis for reordering. With
this approach we use columns of $X^{-T}$ rather than columns
of $X$. We emphasize that although this heuristic
derivation used an assumption that A and B are inverti- 
ble, the GSVD, and hence the final algorithm, applies in 
the non-invertible case. Also, the algorithms that we use 
do not require the computation of matrix inverses. 
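The role of the columns of $X^{-T}$ can be made explicit with a short check. This is a sketch under the same heuristic assumptions as the derivation above ($A$ and $B$ symmetric and invertible, so that all $c_i$ and $s_i$ are positive); the symbols $y_i$ and $e_i$ are introduced here for illustration.

```latex
\begin{align*}
y_i &:= X^{-T} e_i, \\
A^{-1} y_i &= U C^{-1} X^T X^{-T} e_i = c_i^{-1} \, U e_i, \\
B^{-1} y_i &= V S^{-1} X^T X^{-T} e_i = s_i^{-1} \, V e_i, \\
\frac{\|B^{-1} y_i\|_2}{\|A^{-1} y_i\|_2}
&= \frac{s_i^{-1}}{c_i^{-1}} = \frac{c_i}{s_i} = \lambda_i .
\end{align*}
```

Since $U$ and $V$ are orthogonal, the norms of $U e_i$ and $V e_i$ equal one, so the $i$th column of $X^{-T}$ attains the ratio $\lambda_i$ in the inverse-based problem. The last columns therefore give the largest ratios (good clusters for $A$, poor for $B$) and the first columns the smallest.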

In tests on both synthetic and real network pairs, we 
found that this version of the algorithm was more effective [52].
Hence, in this work we focus on the approach
of reordering network pairs via columns of $X^{-T}$. In
summary, the first few columns of $X^{-T}$ should give
orderings that favor clusters in $B$ rather than $A$ and vice
versa for the final few columns. In our computational
examples, we used the gsvd routine built in to MATLAB
(http://www.mathworks.com/).

Synthetic test on binary networks 

In this section we illustrate the algorithm in a simple, 
controlled case where we know the "correct" answer. 
We begin by considering binary networks, where results 
can be clearly visualized. We generated binary adjacency 
matrices A and B as shown in Figure 1. Here we have 
20 nodes. In both networks, nodes 1-5 are well con- 
nected. In A there is a well connected cluster consisting 
of nodes 6-15, whereas in B there is a well connected 
cluster consisting of nodes 15-20. To make the test 
more realistic, the clusters are not perfect; there are 
both missing edges (false negatives) within the clusters
and spurious edges (false positives) outside the clusters.
Our aim is to test whether the algorithms can identify
the clusters that are particular to each data set. We then
show how statistical significance can be quantified.

Figure 1 Adjacency matrices for the two synthetic networks (panel labels: nz = 140 and nz = 82)

We emphasize that the node labeling in Figure 1 was 
chosen purely to make the inherent structure visually 
apparent. Any spectral reordering algorithm should be 
invariant to a relabeling of the input data. In our context,
this follows from the fact that for any permutation
matrix $P$, the factorizations $A = U C X^{-1}$ and $B = V S X^{-1}$
are equivalent to $P A P^T = (PU) C (PX)^{-1}$ and $P B P^T = (PV) S (PX)^{-1}$.
So, on the relabeled data matrices, $(PX)$
plays the role that was played by $X$, and our algorithm
reorders based on the appropriately permuted columns
of $X^{-T}$, as required. In Figure 2 we show the same two
data sets with an arbitrary relabeling in order to illus- 
trate that the inherent structure is no longer apparent. 
In essence, we are hoping that the algorithm will find 
the structure that has been buried in Figure 2. 

In Figure 3 we display the two adjacency matrices
reordered with the algorithm; we show reordering with
eight different columns of $X^{-T}$, four from each end of
the spectrum. We see that mutually exclusive structures
have been uncovered. The reordering from the first column
begins with nodes 18, 20, 16, 15, 19, 17, which
form a cluster in $B$, but not $A$. The final column begins
by picking out nodes 7, 9, 10, 15, 14, 11, 6, 13, which
form the bulk of the 6-15 cluster in $A$. Nodes 8 and 12,
which are missing from this sequential ordering, are
placed at the head of the ordering in the penultimate
column, which begins 12, 8, 7, 10, 15, 14, 9, 11. So in
summary, the 19th and 20th columns of $X^{-T}$ each
reveal almost complete information about the exclusive
cluster in $A$, and between them they capture the full
cluster.

Figure 2 Relabeled versions of the synthetic networks in Figure 1

Cluster validation 

Suppose we find $r$ nodes giving a good cluster $s$ for $B$
but a poor cluster for $A$ when the graphs are reordered
by column $v$ from $X^{-T}$. Is this type of substructure likely
to arise "by chance"? The following general approach
can be used in order to determine a p-value, where we
will regard a value below 0.05 as indicating statistical
significance.

Initialization: Compute a measure of cluster quality,
$c(A, B)$, for the promising substructure consisting of
those $r$ nodes in networks $A$ and $B$ reordered by column $v$.

Step 1: Randomize the networks and obtain new data
sets $\tilde{A}$ and $\tilde{B}$.

Step 2: Compute the GSVD for the randomized networks
$\tilde{A}$ and $\tilde{B}$ and obtain a matrix $\tilde{X}^{-T}$.

Step 3: Compute the measure $c(\tilde{A}, \tilde{B})$ for the $r$ node
'cluster' in $\tilde{A}$ and $\tilde{B}$ reordered by column $v$ from $\tilde{X}^{-T}$.

p-value: After performing $M$ loops over Steps 1 to 3,
compute a p-value as the proportion of $c(\tilde{A}, \tilde{B})$ samples
that exceed $c(A, B)$.
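The loop over Steps 1 to 3 can be sketched as follows for the permutation style of randomization, in which a random relabeling followed by taking the first $r$ nodes is equivalent to drawing $r$ nodes at random, so the GSVD need not be recomputed. This is an illustrative sketch only: `toy_quality` is a hypothetical stand-in for the cluster quality measure, and the two 8-node networks are invented test data.

```python
import random

def permutation_p_value(A, B, cluster, quality, n_loops=1000, seed=0):
    """Proportion of random node subsets whose quality exceeds the observed one."""
    rng = random.Random(seed)
    nodes = range(len(A))
    observed = quality(A, B, cluster)
    exceed = 0
    for _ in range(n_loops):
        candidate = rng.sample(nodes, len(cluster))  # random subset of equal size
        if quality(A, B, candidate) > observed:
            exceed += 1
    return exceed / n_loops

def mean_within(M, nodes):
    """Average weight over unordered pairs of distinct nodes in the subset."""
    pairs = [(i, j) for i in nodes for j in nodes if i < j]
    return sum(M[i][j] for i, j in pairs) / len(pairs)

def toy_quality(A, B, nodes):
    """Hypothetical measure: large when the subset is dense in B and sparse in A."""
    return mean_within(B, nodes) - mean_within(A, nodes)

def clique(n, members):
    """Binary adjacency matrix on n nodes with a clique on the given members."""
    return [[1 if i != j and i in members and j in members else 0
             for j in range(n)] for i in range(n)]

A = clique(8, {4, 5, 6, 7})
B = clique(8, {0, 1, 2, 3})

p_cluster = permutation_p_value(A, B, [0, 1, 2, 3], toy_quality)
p_other = permutation_p_value(A, B, [0, 1, 4, 5], toy_quality)
```

The subset that forms a clique in B only receives a small p-value, while a subset that mixes nodes from both cliques does not.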

For our cluster quality measure $c(A, B)$ we used

$$c(A, B) = \frac{(\text{density of edges within the cluster in } B) / (\text{density of edges outside the cluster in } B)}{(\text{density of edges within the cluster in } A) / (\text{density of edges outside the cluster in } A)}.$$

For these binary graphs, the density $f(s)$ of cluster $s$
was defined as

$$f(s) = \frac{|E(s)|}{|s|}. \tag{7}$$

Here, $|E(s)|$ represents the actual number of edges in
the object block $s$, and $|s|$ is the maximum possible
number of edges.
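For binary networks the quality measure can be computed directly from edge counts. The sketch below makes one assumption that the text does not settle: "outside the cluster" is taken to mean every node pair that is not entirely inside the cluster. The helper names and the 6-node networks are hypothetical.

```python
def densities(M, cluster):
    """Return (density inside the cluster, density outside) for a binary graph."""
    inside = set(cluster)
    n = len(M)
    in_edges = in_pairs = out_edges = out_pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            if i in inside and j in inside:
                in_pairs += 1
                in_edges += 1 if M[i][j] != 0 else 0
            else:
                # Assumption: any pair not fully inside the cluster is "outside".
                out_pairs += 1
                out_edges += 1 if M[i][j] != 0 else 0
    return in_edges / in_pairs, out_edges / out_pairs

def cluster_quality(A, B, cluster):
    """Ratio of (inside/outside) density in B to (inside/outside) density in A."""
    in_a, out_a = densities(A, cluster)
    in_b, out_b = densities(B, cluster)
    return (in_b / out_b) / (in_a / out_a)

def adjacency(n, edges):
    """Symmetric binary adjacency matrix from an edge list."""
    M = [[0] * n for _ in range(n)]
    for i, j in edges:
        M[i][j] = M[j][i] = 1
    return M

# Nodes 0-2 form a clique in B but are sparsely connected in A.
B = adjacency(6, [(0, 1), (0, 2), (1, 2), (3, 4)])
A = adjacency(6, [(0, 1), (3, 4), (4, 5), (3, 5)])
quality = cluster_quality(A, B, [0, 1, 2])
```

A value well above one indicates a subset that is a better cluster in B than in A. The sketch does not guard against zero densities, which would need handling for very sparse networks.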

For weighted graphs, in the case where the cluster is
dominated by positive weights, we will generalize this to

$$f(s) = \frac{w(s)}{|s|}.$$

Here, $w(s)$ denotes the average weight in block $s$. We
note the denominator $|s|$ cancels when ratios are computed
in the p-value algorithm.

Figure 3 Networks reordered using columns of $X^{-T}$

In Figure 3, we see that eight nodes
7, 9, 10, 15, 14, 11, 6, 13 form a cluster in $A$, but not in $B$,
when the synthetic data is reordered with the final column
of $X^{-T}$. Applying the procedure above, using permutation
to randomize the networks $M = 1000$ times as
described below, we obtained a p-value of 0.007. Applying
the same procedure, we also obtained a p-value of
0.029 for the first 6 nodes 18, 20, 16, 15, 19, 17 when
the synthetic data is reordered with the first column of
$X^{-T}$, which visually form a cluster in $B$, but not in $A$.
These p-values (< 0.05) both indicate that the results are
statistically significant. As a further test, we arbitrarily
selected the subnetworks of $A$ and $B$ composed of nodes
2, 4, 12, 16, 1, 3, 18, which correspond to the 12th to 18th
components of the sorted final column from $X^{-T}$. In
this case, we would not expect to find a significant
result. This is reflected in the large p-value of 0.844. In
more exhaustive experiments, three randomization
methods were tested [52]:

• Erdős-Rényi: generate a classical random graph with
the appropriate number of edges.

• Redistribution: redistribute the entries in each row
and each column of $A$, and perform the same operations
on $B$.

• Permutation: reorder the nodes in $A$ and $B$ and
choose the first $r$ nodes in this new ordering. In this
case, recomputation of the GSVD in Step 2 is not necessary,
due to the permutation invariance of the
factorization.

Of those three approaches, Erdős-Rényi may be the
most commonly used method to randomize a binary 
network, whereas permutation extends most naturally to 
the case of weighted edges, so we used permutation in 
the test shown here. We also tested another simple cluster
quality measure: the ratio of the density of edges
within the cluster in one graph to that in the other
graph.

These variations were studied within this general
methodology on both real and synthetic data sets [52].
In all cases, comparable p-values were produced.
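Of the three randomization methods, the Erdős-Rényi variant is the simplest to illustrate for binary networks. The sketch below assumes an undirected graph without self-loops and interprets "the appropriate number of edges" as the edge count of the input network; the function name and example graph are hypothetical.

```python
import random

def erdos_renyi_like(A, seed=None):
    """Random undirected binary graph with the same number of edges as A."""
    rng = random.Random(seed)
    n = len(A)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    n_edges = sum(1 for i, j in pairs if A[i][j] != 0)
    R = [[0] * n for _ in range(n)]
    # Place the edges on node pairs chosen uniformly without replacement.
    for i, j in rng.sample(pairs, n_edges):
        R[i][j] = R[j][i] = 1
    return R

# Hypothetical 5-node network with a triangle on nodes 0-2 and one extra edge.
A = [[0, 1, 1, 0, 0],
     [1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 1, 0]]
R = erdos_renyi_like(A, seed=42)
```

The randomized network keeps the overall edge density but discards any cluster structure, which is what makes it usable as a null model for binary data.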

Synthetic test on correlation networks 

Having tested the algorithm on binary networks, we 
now consider the case where weighted edges arise as 
correlation coefficients. 

First, we generate two correlation matrices A and B as 
shown in Figure 4. Here, each graph has 20 nodes, and 
each entry is real valued, representing the correlation 
coefficient between the corresponding nodes. The same 
cluster patterns given for the synthetic binary matrices 
in Figure 1 were built in to the synthetic correlation 
data: nodes 1-5 are well connected in both networks; in
A there is a well connected cluster consisting of nodes
6-15, whereas in B there is a well connected cluster consisting
of nodes 15-20. Some noise was added to the
clusters to make this test more realistic.

More precisely, in our computation, the value of
each entry (the correlation coefficient) in $A$ and $B$ as
shown in Figure 4 is generated from a pair of 20 × 50
rectangular matrices $D_a$ and $D_b$. The corresponding
cluster patterns are built from signals. Figure 5 shows
the nine signals that take part in the data. These are
row vectors with 50 elements. We use $v^{[1]}, v^{[2]}, \ldots, v^{[9]}$
to denote them. From these signals, we set up two
matrices

• $D_a \in \mathbb{R}^{20 \times 50}$: the first 5 rows are linear combinations
of a subset of the signals. Rows 6
to 15 are combinations of two of the signals. The remaining
rows (rows 16 to 20) are Gaussian pseudorandom
numbers.

• $D_b \in \mathbb{R}^{20 \times 50}$: the first 5 rows are linear combinations
of a subset of the signals. Rows 6
to 14 are Gaussian pseudorandom numbers. The
remaining rows (rows 15 to 20) are combinations of
two of the signals.

Building up the rows from the underlying signals in 
this manner allowed us to construct the correlation pat- 
terns seen in Figure 4. 
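The idea of building correlated rows from shared underlying signals can be sketched as follows. Everything specific here is an assumption made for illustration: the sine-wave signals, the weights, the noise level and the reduced size (5 rows instead of 20) are hypothetical and do not reproduce the signals of Figure 5.

```python
import math
import random

rng = random.Random(1)
T = 50  # number of elements in each signal, as in the text

# Hypothetical underlying signals: sine waves of different frequency.
signals = [[math.sin(2 * math.pi * (k + 1) * t / T) for t in range(T)]
           for k in range(3)]

def noisy_combination(weights, noise=0.1):
    """Linear combination of the signals plus small Gaussian noise."""
    return [sum(w * s[t] for w, s in zip(weights, signals)) + rng.gauss(0, noise)
            for t in range(T)]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Rows 0-2 share the first signal (a built-in cluster); rows 3-4 are pure noise.
rows = [noisy_combination([1.0 + 0.1 * i, 0.0, 0.0]) for i in range(3)]
rows += [[rng.gauss(0, 1) for _ in range(T)] for _ in range(2)]

corr = [[pearson(rows[i], rows[j]) for j in range(5)] for i in range(5)]
```

Rows that share a signal end up strongly correlated with each other, while the pseudorandom rows show only weak correlations, giving a correlation matrix with a planted cluster.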

Although the algorithm is invariant to permutation, 
for visual clarity, we also shuffled the synthetic correla- 
tion data sets A and B before applying our algorithm to 
them. Figure 6 shows the same synthetic correlation 
data sets with an arbitrary relabeling. 

We present the results from our algorithm in Figures 7 and 8. We show the relabeled $A$ and $B$ reordered with two extreme columns of $X^{-T}$, one from each of the two ends of the spectrum. The reorderings reveal the mutually exclusive cluster structures of $A$ and $B$. We also applied the cluster validation method to the structures uncovered by the reorderings using random permutation. In Figure 7 we see that the first column of $X^{-T}$ picks out the contiguous nodes 17, 15, 20, 18, 16, 19, which form a good cluster in $B$ but not in $A$ (p < 0.001). The reordering from the final column of $X^{-T}$ shown in Figure 8 reveals that the 6-15 cluster in $A$ but not in $B$ was completely uncovered by the nodes 10, 14, 12, 9, 8, 7, 6, 13, 15, 11 (at the top left hand side of the heatmaps, p < 0.001).

In summary, this additional synthetic test illustrates that our GSVD based algorithm can be extended to reveal the pattern difference between two related correlation matrices in terms of clustering.

Figure 4 The original synthetic correlation data.

Figure 5 The nine signals.

Figure 6 Relabeled versions of the synthetic networks in Figure 4

Figure 7 The synthetic correlation data reordered with the first column from $X^{-T}$ (panel title: B reordered by column 1)

Quantitative determination of metabolic pathways 
disrupted in the prefrontal cortex of PCP-treated animals 

SIEVE analysis (Thermo-Fisher Scientific) revealed significant
PCP-induced alterations in the level of specific
metabolites in the PFC of PCP-treated rats (Table 1;
Additional File 1). These changes were evident in multiple
metabolic pathways as defined by the Kyoto Encyclopedia
of Genes and Genomes (KEGG) metabolite
pathways database. Significant changes were evident in 
(i) glutamate metabolism (3 metabolites [m, n]), (ii) the 
alanine, aspartate and glutamate pathway (2 metabolites 
[n]), (iii) phenylalanine, tyrosine and tryptophan meta- 
bolism (3 metabolites [a]), (iv) purine metabolism (2 
metabolites [o]) and (v) butanoate metabolism (2 meta- 
bolites [k]). This suggested that these metabolic path- 
ways are disrupted in the PFC of PCP-treated animals. 



However, this simple level of analysis prevents any 
quantitative and statistically rigorous determination of 
the predefined (KEGG) metabolic pathways disrupted in 
the PFC of PCP-treated animals. 

In the context of this study the aim of applying the 
GSVD algorithm to metabolomic data from control and 
PCP-treated animals was to quantitatively determine 
which predefined metabolic pathways were altered in 
PCP-treated animals. The inter-metabolite Pearson's 
correlation coefficient (partial correlation) was used as 
the metric of the functional association between each 
pair of metabolites and was generated from the metabo- 
lite peak intensities, as determined by Liquid Chromato- 
graphy Mass Spectrometry (LC-MS), across all animals 
within the same experimental group (i.e. either control 
or PCP-treated). These correlations were Fisher trans- 
formed to give the correlation data a normal distribu- 
tion. This resulted in a pair of symmetric, square, real- 
valued {98 X 98} partial correlation matrices (Control 
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Table 1 PCP-induced alterations In PFC metabolite levels as determined by SIEVE analysis 



Formula 


Metabolite 


Metaboite 
KEGG ID 


KEGG 
Pathways 


P -value 


Ratio 


CgH] 1 /VO3 


L-Tyrosine 


C00082 


ko00350, koOOodO, ko00400 


0.001 


0.584 




gamma G utamylg utamine 


NA 


NA 


0.007 


0.673 




L-Citrulline 


c00327 


koUUjjU 


0.007 


0.709 


C^HyN02S 


L-Cysteine 


cUuuy/ 


i,.^riAi/"r\ \,r^r\r\~)~7r\ \,^r\r\A3r\ L^AA/ior\ L^rirmA \,^r\r\~7~7r\ \,^r\r\r\'^r\ 
KOUUioL), KOL)0z/U, KOO(J43U, kol)U4oU, KOOO/3U, KO(J(J//U, K0UU9z(J 


0.01 2 


0.445 


CsHgNO 


2-Pheny acetamide 


c02505 


KoUOJoU 


0.01 5 


0.561 


r N n 


r 1 itri ly 1 py I U Vd Lc 


rOni 1^1^ 
LUU 1 00 




0 016 


0 57 


C4H6O2 


2,3-Butanedione 


C00741 


map00650 


0.017 


0.786 


C4H5N3O 


Cytosine 


C00380 


ko00240 


0.019 


0.665 


CmHgN02 


GABA 


C00334 


ko00250, ko00330, ko00410, ap00650, ko04080 


0.021 


0.804 


CgH.yNO^ 


0-Acetylcarnitine 


C02571 


ko00250 


0.022 


2.649 


C.^H.sNsOuP 


Adenylosuccinate 


C03794 


ko00230, ko00250 


0.029 


3.276 


C5H5N5O 


Guanine 


C00242 


ko00230 


0.035 


0.593 




Carnitine 


C00487 


ko00310 


0.037 


0.819 



Table 1 shows the molecular formula, tentative metabolite identity and the KEGG pathways in which a metabolite is involved. Only metabolites found to be
significantly different between the two experimental groups by SIEVE analysis (see Methods section) are shown. Full data for all metabolites detected in the PFC
of control and PCP-treated rats are shown in Table S1 (Additional File 1). The most prominent alterations in KEGG defined metabolic pathways appeared to be in
(i) alanine, aspartate and glutamate metabolism (3 metabolites [ko00250]), (ii) phenylalanine, tyrosine and tryptophan metabolism (3 metabolites [ko00360]), (iii)
purine metabolism (2 metabolites [ko00230]) and (iv) butanoate metabolism (2 metabolites [ko00650]). KEGG defined metabolic pathways: ko00250: Alanine,
Aspartate and Glutamate metabolism; ko00330: Arginine and Proline metabolism; ko00410: beta-Alanine metabolism; map00650: Butanoate metabolism; ko00270:
Cysteine and Methionine metabolism; ko00480: Glutathione metabolism; ko00260: Glycine, Serine and Threonine metabolism; ko00430: Methionine metabolism;
ko04080: Neuroactive ligand-receptor interaction; ko00770: Pantothenate and CoA biosynthesis; ko00360: Phenylalanine metabolism; ko00400: Phenylalanine, Tyrosine
and Tryptophan biosynthesis; ko00230: Purine metabolism; ko00240: Pyrimidine metabolism; ko00920: Sulphur metabolism; ko00430: Taurine and Hypotaurine
metabolism; ko00730: Thiamine metabolism; ko00350: Tyrosine metabolism. NA denotes a metabolite not
associated with a KEGG compound ID or KEGG pathway.



animals: Additional File 2; PCP-treated animals: Additional
File 3). Each within-group matrix represents the
specific association strength between each of the 9506
possible ordered pairs of metabolites (4753 distinct pairs,
as the matrices are symmetric) in that experimental group.
In the simplest biological case the correlation coefficient 
between two metabolites (nodes) in the matrix repre- 
sents the series of enzymatic reactions responsible for 
converting one metabolite into another. However, it 
should be noted that this simple interpretation does not 
account for the complex relationships that may influ- 
ence the correlation between two metabolites, such as 
the involvement of metabolites in alternative, often par- 
allel, metabolic pathways. There are important limita- 
tions that must be recognized when modeling 
metabolomic data as a complex network of interactions 
between metabolites (as defined by the correlation that 
exists between them) such as the potential for correla- 
tions to exist between metabolites that are not biologi- 
cally relevant. The impact of such erroneous 
associations on the interpretation of the data as outlined 
in this paper will be limited by the approach of charac- 
terizing alterations at the level of metabolic pathways, 
involving multiple metabolites (the approach taken in 
this study), rather than considering the disruption of
a single correlation coefficient between two metabolites.
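
The construction of a Fisher-transformed correlation matrix for one experimental group can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, plain Pearson correlation is used as a stand-in for the partial correlations of the study, and the diagonal is set to zero because the Fisher transform of a self-correlation of 1 is unbounded.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r, eps=1e-12):
    """Fisher transform z = atanh(r); r is clipped so that z stays finite."""
    r = max(-1.0 + eps, min(1.0 - eps, r))
    return math.atanh(r)

def fisher_correlation_matrix(intensities):
    """intensities[m] holds one metabolite's peak intensity for each animal.

    Returns a symmetric matrix of Fisher-transformed correlations with a
    zero diagonal (an assumption made here, not taken from the paper).
    """
    p = len(intensities)
    z = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            z[i][j] = z[j][i] = fisher_z(pearson(intensities[i], intensities[j]))
    return z
```

Applied to the 98 metabolites of one group, such a routine would return a 98 × 98 symmetric matrix of the kind denoted A or B in the text.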

Our network treats interactions between molecules as 
bidirectional, and so the set of interactions between 



molecules forms an undirected weighted network. In 
essence the GSVD algorithm allows the reordering of 
the two experimental matrices A (control animals) and 
B (PCP-treated animals) with the aim of discovering a 
new node (metabolite) ordering that reveals clusters of 
nodes that exhibit strong connectivity (mutual weights) 
in one network but not the other. In the context of this 
data the GSVD algorithm was used to identify clusters 
of metabolites present in one experimental group that 
are not present in the other with the aim of identifying 
those metabolic pathways in the PFC disrupted by PCP 
treatment. Once the matrices had been reordered 
through the GSVD algorithm the significant presence of 
a cluster in the given network was statistically tested by 
comparison of the cluster quality measure in the real 
networks relative to that in 1000 random permutations 
of the initial matrices. The original metabolomic net- 
works are shown in Figure 9, where matrix A represents 
control animals and B represents PCP-treated animals. 
Figures 10 and 11 show the networks reordered by the 
first and the final column of X, respectively. The original
position of each metabolite detected by LC-MS
(Figure 9) and its new position in each of the reordered 
matrices (Figures 10 and 11) are shown in Additional 
File 4. Visually, in Figure 10 there was no obvious pat- 
tern of clustering that would identify significant clusters 
of metabolites present in PCP-treated animals that were 
not present in controls. In contrast, in Figure 11 there
appeared to be clusters of metabolites present in the 
PFC of control animals that were not present in PCP- 
treated animals (top left and bottom right hand side of 
the heatmap). For Figure 11 the significance of the top 
cluster (first 22 nodes in the reordering, p < 0.001) and 
the bottom cluster (last 18 nodes in the reordering, p < 
0.001) was confirmed, indicating that there were clusters 
of metabolites significantly present in control (A) ani- 
mals that were not present in PCP-treated (B) animals. 
The identity of the metabolites, the KEGG pathways in 
which each metabolite is involved, and the PCP-induced 
alteration in the overt level of each metabolite (as deter- 
mined by SIEVE analysis) are shown in Tables 2 and 3 
for the top and bottom cluster, respectively. In contrast 
to the metabolite clustering shown in Figure 11 there 
was no evidence in Figure 10 for any significant cluster 
of metabolites present in PCP-treated animals (B) that 
was not present in control (A) animals: (i) potential top 
cluster [first 10 nodes] p = 0.421; (ii) potential middle 
cluster [nodes 18-25] p = 0.494. 
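
The reordering step can be illustrated with a short numerical sketch. It assumes that the columns of X coincide, up to scaling, with generalized eigenvectors of the pair (A^T A, B^T B), which holds when the GSVD is written as A = U C X^{-1}, B = V S X^{-1}; other software conventions return a differently defined X. The function names and the small ridge term are hypothetical and are not taken from the original implementation.

```python
import numpy as np

def gsvd_orderings(A, B, ridge=1e-8):
    """Node orderings from the extreme generalized singular vectors of (A, B).

    Solves A^T A x = w B^T B x. A column x with small w has a small ratio
    ||Ax||^2 / ||Bx||^2 (B dominates); a column with large w has the reverse.
    """
    n = A.shape[0]
    GA = A.T @ A
    GB = B.T @ B + ridge * np.eye(n)      # hypothetical ridge keeps GB positive definite
    L = np.linalg.cholesky(GB)            # GB = L L^T
    L_inv = np.linalg.inv(L)
    M = L_inv @ GA @ L_inv.T              # equivalent symmetric eigenproblem
    w, Y = np.linalg.eigh((M + M.T) / 2)  # eigenvalues in ascending order
    X = L_inv.T @ Y                       # columns satisfy GA x = w GB x
    first = np.argsort(X[:, 0])           # ordering by the first column of X
    last = np.argsort(X[:, -1])           # ordering by the final column of X
    return first, last, w, X

def reorder(W, order):
    """Apply the same permutation to the rows and columns of W."""
    return W[np.ix_(order, order)]
```

With the eigenvalues in ascending order, the first column corresponds to the direction in which B dominates A and the final column to the direction in which A dominates B, which is consistent with the roles of Figures 10 and 11 described above.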



Rigorous significance testing, involving multiple poten- 
tial metabolite clusters, confirmed that there were no 
significant clusters of metabolites in PCP-treated ani- 
mals that were not present in controls (Figure 10). Fol- 
lowing significance testing of potential metabolite 
clusters in the GSVD reordered matrices, hypergeo- 
metric probability (described in the Methods section) 
was applied to test the significance of KEGG defined 
metabolite pathway over-representation in these clusters. 
The results for hypergeometric probability testing are 
shown in Tables 4 and 5. 
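
A permutation test of the kind used to assess candidate clusters can be sketched as follows. The quality measure below (mean absolute within-cluster weight in A minus that in B) is a hypothetical stand-in for the cluster quality measure of the paper, and drawing a random node subset of the same size is used here as an equivalent of evaluating the cluster positions under a random relabelling of the nodes.

```python
import random

def cluster_contrast(A, B, nodes):
    """Hypothetical quality measure: mean |weight| inside the cluster in A minus that in B."""
    pairs = [(i, j) for i in nodes for j in nodes if i < j]
    mean_a = sum(abs(A[i][j]) for i, j in pairs) / len(pairs)
    mean_b = sum(abs(B[i][j]) for i, j in pairs) / len(pairs)
    return mean_a - mean_b

def permutation_p_value(A, B, nodes, n_perm=1000, seed=0):
    """Fraction of random node subsets scoring at least as high as the observed cluster."""
    rng = random.Random(seed)
    n, k = len(A), len(nodes)
    observed = cluster_contrast(A, B, nodes)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.sample(range(n), k)   # random cluster of the same size
        if cluster_contrast(A, B, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)         # add-one correction avoids p = 0
```

With 1000 permutations the smallest attainable value under this add-one convention is 1/1001, so a cluster that no random subset matches is reported as p < 0.001.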

Discussion 

Through its application to metabolomic data we have 
clearly demonstrated the added value that can be gained 
from applying the GSVD algorithm to two sets of com- 
plex, network data based upon the same set of nodes. In 
particular, we have demonstrated that the combined 
application of the GSVD algorithm with hypergeometric 
probability analysis provides an analytical framework by 
which statistical alterations in predefined metabolic 
Figure 10 Control (A) and PCP (B): reordered with the first column from X
Table 2 Metabolite identities and their relevant KEGG pathways in the top cluster of Figure 11

| Formula | Metabolite | Metabolite KEGG ID | KEGG Pathways | P-value | Ratio |
|---|---|---|---|---|---|
| | L-Glutamine | C00064 | ko00230, ko00240, ko00250, ko00330 | 0.522 | 0.959 |
| H3PO4 | Phosphoric acid | C00009 | ko00190 | 0.254 | 0.915 |
| C5H7NO3 | 1-Pyrroline-4-hydroxy-2-carboxylate | C04282 | ko00330 | 0.781 | 0.981 |
| C4H9N3O2 | Creatine | C00300 | ko00330, ko00260 | 0.551 | 0.953 |
| C4H9NO2 | GABA | C00334 | ko00250, ko00330, ko00410, ko04080, map00650 | 0.021 | 0.804 |
| C4H7NO4 | L-Aspartate | C00049 | ko00250, ko00250, ko00270, map00300, ko00330, ko00340, ko00410, ko00760, ko00770, ko04080 | 0.319 | 0.916 |
| C4H7NO2 | 1-Aminocyclopropane-1-carboxylate | C01234 | ko00270, ko00640 | 0.590 | 0.951 |
| C5H5N5O | Guanine | C00242 | ko00230 | 0.035 | 0.593 |
| C5H9NO4 | Glutamate | C00025 | ko00250, ko00330, ko00340, ko00471, ko04080, ko00480, map00650 | 0.845 | 0.985 |
| | Hydroxymethylpropanitrile | NA | NA | 0.098 | 0.842 |
| | Nicotinamide | C00153 | ko00760 | 0.440 | 0.917 |
| C4H6O2 | 2,3-Butanedione | C00741 | map00650 | 0.017 | 0.786 |
| [illegible] | Pantoate | C00552 | ko00770 | 0.722 | 0.963 |
| C15H23N5O14P2 | ADP-ribose | C00301 | ko00230 | 0.058 | 677.029 |
| C3H7NO3 | L-Serine | C00065 | ko00260, ko00270, ko00600, ko00920, ko00680 | 0.316 | 0.856 |
| C4H5N3O | Cytosine | C00380 | ko00240 | 0.019 | 0.665 |
| C2H7NO3S | Taurine | C00245 | ko00430, ko04080 | 0.936 | 0.995 |
| C4H5NO3 | Maleamate | C01596 | ko00760 | 0.372 | 0.927 |
| | Ethanolamine phosphate | C00346 | ko00260, ko00564, ko00600 | 0.373 | 0.889 |
| | Unknown ID | NA | NA | 0.271 | 1.395 |
| C5H11NO3 | Hydroxyvaline | NA | NA | 0.585 | 0.946 |
| | L-Citrulline | C00327 | ko00330 | 0.007 | 0.709 |

Table 2 shows the top cluster of metabolites identified by the GSVD algorithm that are present in the PFC of control but not PCP-treated animals (Figure 11). The
molecular formula, tentative molecular identity, its KEGG compound identity and the KEGG metabolic pathways in which a given metabolite is involved are also
shown. The key for each KEGG pathway identity is shown in Table 4. The p-values and ratio change reported for each metabolite in this table are those
calculated by SIEVE analysis. Those metabolites found to be significantly different between the two groups by SIEVE analysis are highlighted in bold. While SIEVE
analysis fails to attribute significance (p < 0.05) to PCP-induced alterations in the overt concentration of many metabolites in this cluster, GSVD analysis reveals
that the relationship between these metabolites is significantly altered by PCP treatment (p < 0.001), highlighting the specific metabolic pathways that may be
disrupted in the PFC of PCP-treated animals. The most prominent alterations in KEGG defined pathways in this cluster were in (i) Arginine and Proline
metabolism (7 metabolites [ko00330]), (ii) Glycine, Serine and Threonine metabolism (3 metabolites [ko00260]) and (iii) KEGG defined neuroactive ligands (4
metabolites [ko04080]).



Xiao et al. BMC Systems Biology 2011, 5:72
http://www.biomedcentral.com/1752-0509/5/72






Table 3 Metabolite identities and their relevant KEGG pathways in the bottom cluster of Figure 11

| Formula | Metabolite | Metabolite KEGG ID | KEGG Pathways | P-value | Ratio |
|---|---|---|---|---|---|
| C5H4N4O2 | Xanthine | C00385 | ko00230 | 0.339 | 0.508 |
| | Gamma-glutamylglutamic acid | NA | NA | 0.143 | 0.54 |
| | Myristoleic acid | C08322 | NA | 0.689 | 0.623 |
| C5H4N4O | Hypoxanthine | C00252 | ko00230 | 0.115 | 0.569 |
| C17H37NO2 | Heptadecasphinganine | NA | NA | 0.733 | 0.769 |
| C10H13N4O8P | Inosine monophosphate | C00130 | ko00230 | 0.461 | 0.73 |
| | Peptide fragment (Arg-Arg-Gln) | NA | NA | 0.775 | 1.183 |
| C6H15NO3 | Triethanolamine | C05771 | ko00564 | 0.691 | 1.207 |
| | Carnosine | C00385 | ko00340, ko00410 | 0.872 | 1.128 |
| C10H12N4O5 | Inosine | C00294 | ko00230 | 0.090 | 0.6 |
| | [illegible] | [illegible] | NA | 0.196 | 0.862 |
| C10H17N3O6 | gamma Glutamylglutamine | NA | NA | 0.007 | 0.673 |
| C26H42N7O20P3S | 2-Hydroxyglutaryl-CoA | C03058 | map00650 | 0.179 | 0.715 |
| C31H54N7O17P3S | Decanoyl-CoA | C05274 | ko00071 | 0.410 | 1.312 |
| | 2-Aminoethylphosphocholate | C05683 | ko00440 | 0.243 | 0.662 |
| | Eudesmin | NA | NA | 0.084 | 0.493 |
| C3H7NO2S | L-Cysteine | C00097 | ko00260, ko00270, ko00430, ko00480, ko00730, ko00770, map00920 | 0.012 | 0.445 |
| C3H7O6P | Glycerone phosphate | C00111 | ko00010, ko00051, ko00052, ko00561, ko00562, ko00564, ko00620 | 0.063 | 0.381 |

Table 3 concerns the bottom cluster of metabolites identified by the GSVD algorithm that are present in the PFC of control animals but not PCP-treated animals.
The molecular formula, tentative molecular identity, its KEGG compound identity and the KEGG metabolic pathways in which a given metabolite is involved are
shown. The identity of each KEGG pathway ID is shown in Table 5. The p-values and ratio change reported for each metabolite in this table are those calculated
by SIEVE analysis. Those metabolites found to be significantly different between the two groups by SIEVE analysis are highlighted in bold. While SIEVE analysis fails to
attribute significance (p < 0.05) to PCP-induced alterations in the overt concentration of many metabolites in this cluster, the PCP/Control
ratio suggests that the levels of many of these metabolites are markedly altered by PCP-treatment. GSVD analysis reveals that the relationship between the levels
of the metabolites in this cluster is significantly altered by PCP-treatment (p < 0.001), highlighting specific metabolic pathways that may be disrupted in the
PFC of PCP-treated animals. There appears to be an overabundance of Purine (4 metabolites [ko00230]) and Glycerophospholipid (2 metabolites [ko00564]) metabolites in the
bottom cluster.



pathways between experimental groups can be defined 
from complex metabolomic data. There is a great unmet 
need for this type of analytical approach in metabolo- 
mics, as well as in the other -omics fields (e.g. transcrip- 
tomics), which allows the quantification of alterations at 
the biological systems (pathways) level rather than sim- 
ply identifying significant alterations of discrete mea- 
sures (i.e. single metabolites). 

Through the application of this analytical approach we 
identified statistically significant alterations in specific, 
pre-defined metabolic pathways (KEGG database path- 
ways) that may contribute to PFC dysfunction in PCP- 
treated animals, and so in schizophrenia. This included 
the disruption of the (1) Alanine, Aspartate and Gluta- 
mate [ko00250], (2) Arginine and Proline [ko00330], (3) 
Butanoate [ko00650], (4) Nicotinate and Nicotinamide 
[ko00760], (5) Glycine, Serine and Threonine metabolic 
pathways as well as an imbalance in (6) metabolites 
active as neurotransmitter ligands [ko04080]. The dis- 
ruption of metabolic pathways involving glutamate in 
the PFC of PCP-treated rats seems particularly pertinent 
given the reported alterations in extracellular glutamate 



availability in the PFC following repeated PCP treatment 
[53] and the central hypothesis of hypofunctional gluta- 
matergic PFC neurotransmission in schizophrenia 
[54,55]. In addition to altered glutamate metabolism 
there was also evidence to support an imbalance in mul- 
tiple metabolites known to be active at glutamate recep- 
tors. This included an imbalance in the relationship 
between glutamate, L-aspartate and Taurine (Table 2)
which are all known to be active at glutamate receptors. 
Furthermore, evidence for the disruption of glycine, ser- 
ine and threonine metabolism may suggest that glycine 
and serine activity as co-agonists at the NMDA recep- 
tors may be disrupted in the PFC of PCP-treated ani- 
mals. However, it is important to note that we failed to 
detect glycine levels in our samples and serine levels 
appear to be overtly unchanged. The possibility of 
altered glycine levels in the PFC of PCP-treated rats 
warrants further investigation given the ability of glycine 
and NMDA receptor glycine site agonists to reverse 
subchronic PCP-induced alterations in PFC dopaminer- 
gic neurotransmission [56,57], which may be central to 
the impact of subchronic PCP treatment on cognition. 
Table 4 Hypergeometric probability of KEGG defined metabolic pathways in the top cluster in Figure 11

| KEGG Pathway Identity | KEGG Pathway | Number of metabolites in cluster (A) | Total number of pathway metabolites detected (B) | Hypergeometric Probability (P(X ≥ k)) |
|---|---|---|---|---|
| ko00190 | Oxidative phosphorylation | 1 | 1 | 0.224 |
| ko00230 | Purine metabolism | 3 | 13 | 0.598 |
| ko00240 | Pyrimidine metabolism | 2 | 6 | 0.406 |
| ko00250 | Alanine, Aspartate and Glutamate metabolism | 4 | 7 | 0.043 |
| ko00260 | Glycine, Serine and Threonine metabolism | 4 | 7 | 0.043 |
| ko00270 | Cysteine and Methionine metabolism | 3 | 7 | 0.186 |
| map00300 | Lysine biosynthesis | 1 | 3 | 0.538 |
| ko00330 | Arginine and Proline metabolism | 7 | 10 | 0.001 |
| ko00340 | Histidine metabolism | 2 | 5 | 0.312 |
| ko00410 | beta-Alanine metabolism | 2 | 5 | 0.312 |
| ko00430 | Taurine and Hypotaurine metabolism | 1 | 3 | 0.538 |
| ko00471 | D-glutamine and D-glutamate metabolism | 1 | 1 | 0.224 |
| ko00480 | Glutathione metabolism | 1 | 5 | 0.728 |
| ko00564 | Glycerophospholipid metabolism | 1 | 11 | 0.949 |
| ko00600 | Sphingolipid metabolism | 2 | 3 | 0.126 |
| ko00640 | Propanoate metabolism | 1 | 2 | 0.400 |
| map00650 | Butanoate metabolism | 3 | 4 | 0.034 |
| ko00680 | Methane metabolism | 1 | 1 | 0.224 |
| ko00760 | Nicotinate and Nicotinamide metabolism | 3 | 4 | 0.034 |
| ko00770 | Pantothenate and CoA biosynthesis | 2 | 5 | 0.312 |
| ko00920 | Sulphur metabolism | 1 | 3 | 0.538 |
| ko04080 | Neuroactive ligand-receptor interaction | 4 | 7 | 0.043 |


Table 4 shows the hypergeometric probability of at least the observed number of metabolites arising by chance for a given KEGG defined metabolic pathway in
the top cluster of Figure 11, identified through the GSVD algorithm as being present in control but not PCP-treated animals. Further computational details are
given in the Methods section. The cluster size was 22 metabolites from a total population of 98. There was a significant over-representation of metabolites of (i)
Alanine, Aspartate and Glutamate metabolism [ko00250], (ii) Arginine and Proline metabolism [ko00330], (iii) Butanoate metabolism [ko00650], (iv) Nicotinate and
Nicotinamide metabolism [ko00760], (v) Glycine, Serine and Threonine metabolism and (vi) those metabolites active as neurotransmitter ligands [ko04080] (all
highlighted in bold), suggesting that these pathways are disrupted in the PFC of PCP-treated animals.



Altered glycine, serine and threonine metabolism in the 
PFC of PCP-treated animals is also consistent with the 
hypothesis that glycine and serine represent potential 
therapeutic targets for the treatment of schizophrenia 
[58]. In addition, we found evidence to suggest that 
GABA neurotransmission was also significantly 
decreased in the PFC of PCP-treated rats, which may 
relate to the compromised integrity of GABAergic interneurons
in these animals [3,6], which closely resembles
the GABAergic interneuron alterations seen in schizophrenia.
The imbalance in glutamate, glutamine and
GABA levels identified in the PFC of PCP-treated rats 
may directly contribute to the hypofrontality (glucose 
hypometabolism) seen in these animals, as detected 



using the 14C-2-deoxyglucose imaging technique [4], as
all of these metabolites are intimately linked through 
metabolic pathways and have a central role in regulating 
the coupling of neuronal activity to cerebral glucose 
metabolism [59,60]. 

Our results also suggest that glutamatergic dysfunction 
in the PFC of PCP-treated rats is not limited to the disrup- 
tion of glutamatergic neurotransmission but also involves 
the disruption of the metabolic pathways in which glutamate
is involved. For example, altered glutamate metabolism
may directly contribute to the disruption of the
Arginine-Proline metabolic pathway in the PFC of PCP- 
treated animals. The significant disruption of the Arginine 
pathway in PCP-treated animals suggests that prolonged 
Table 5 Hypergeometric probability of KEGG defined metabolic pathways in the bottom cluster in Figure 11

| KEGG Pathway Identity | KEGG Pathway | Number of metabolites in cluster (A) | Total number of pathway metabolites detected (B) | Hypergeometric Probability (P(X ≥ k)) |
|---|---|---|---|---|
| ko00010 | Glycolysis/Gluconeogenesis | 1 | 1 | 0.184 |
| ko00051 | Fructose and Mannose metabolism | 1 | 1 | 0.184 |
| ko00052 | Galactose metabolism | 1 | 1 | 0.184 |
| ko00071 | Fatty acid metabolism | 1 | 1 | 0.184 |
| ko00230 | Purine metabolism | 4 | 13 | 0.191 |
| ko00260 | Glycine, Serine and Threonine metabolism | 1 | 7 | 0.770 |
| ko00270 | Cysteine and Methionine metabolism | 1 | 7 | 0.770 |
| ko00340 | Histidine metabolism | 1 | 5 | 0.646 |
| ko00410 | beta-Alanine metabolism | 1 | 5 | 0.646 |
| ko00430 | Taurine and Hypotaurine metabolism | 1 | 3 | 0.460 |
| ko00440 | Phosphonate and Phosphinate metabolism | 1 | 2 | 0.335 |
| ko00480 | Glutathione metabolism | 1 | 5 | 0.646 |
| ko00561 | Glycerolipid metabolism | 1 | 2 | 0.335 |
| ko00562 | Inositol Phosphate metabolism | 1 | 2 | 0.335 |
| ko00564 | Glycerophospholipid metabolism | 2 | 11 | 0.642 |
| ko00620 | Pyruvate metabolism | 1 | 2 | 0.335 |
| map00650 | Butanoate metabolism | 1 | 4 | 0.562 |
| ko00730 | Thiamine metabolism | 1 | 1 | 0.184 |
| ko00770 | Pantothenate and CoA biosynthesis | 1 | 5 | 0.646 |
| map00920 | Sulphur metabolism | 1 | 3 | 0.460 |


Table 5 shows the hypergeometric probability of randomly seeing at least the observed number of metabolites of a given KEGG pathway in the bottom cluster
of Figure 11, identified through the GSVD algorithm as being present in control animals but not in PCP-treated animals. There was no evidence for a particular
over-abundance of metabolites from any given KEGG pathway in this cluster. Cluster size is 18 metabolites from a total population of 98.



NMDA receptor hypofunction may result in disrupted
nitric oxide (NO) signalling in the PFC. There is increasing
evidence that NO signalling is directly linked to NMDA
receptor activity through regulation of the enzyme nitric
oxide synthase (NOS) [61] and that NO signalling contributes
to the deficits in cognition that arise from acute
NMDA receptor blockade [62,63]. The finding that levels of
Citrulline, a metabolite in the Arginine-Proline pathway,
are significantly decreased in the PFC of the PCP-treated
rats in this study further supports the suggestion that
NOS activity is altered in the PFC of these animals, as this
metabolite is formed by NOS when it releases NO from
L-arginine. This suggests that NMDA receptor hypofunction
may underlie the decreased NOS activity and protein
expression levels reported in the PFC of schizophrenia
patients [64] and may contribute to the cognitive deficits
seen in this disorder.

In addition to quantitatively defining the specific metabolic
pathways altered by experimental manipulation, our
results suggest that the GSVD algorithm can identify discrete
series of metabolic reactions altered by experimental
manipulation. In this way, while we found no
significant evidence to support the widespread disruption
of purine metabolism, or the significant disruption of any
other KEGG defined metabolic pathway in the bottom
cluster as detected using the GSVD, we did find evidence
in this cluster to suggest that a specific series of purine
reactions was significantly disrupted in the PFC of
PCP-treated animals. These disrupted purinergic reactions in
the PFC of PCP-treated animals were:

Xanthine† – Hypoxanthine† – Inosine† – IMP† – Adenylosuccinate*

*denotes significantly increased levels in the PFC of PCP-treated rats (SIEVE analysis)
†denotes series of reactions disturbed in the PFC of PCP-treated rats (GSVD analysis)

This result suggests that the activity of adenylosuccinate
synthase (ADSS), the enzyme responsible for the
conversion of IMP to adenylosuccinate, may be significantly
increased in the PFC of PCP-treated animals. An
increase in the functional activity of this enzyme could
result in both the increased level of adenylosuccinate
and the altered balance in the enzyme's downstream 
metabolites (IMP, Inosine, Hypoxanthine, Xanthine) 
seen in the PFC of PCP-treated animals. While the 
influence of prolonged NMDA receptor hypofunction 
on the functional activity of this specific enzyme 
remains to be confirmed, and clearly warrants further 
systematic investigation, the recent finding of altered 
ADSS gene expression in schizophrenia [65] and the 
association of ADSS gene polymorphisms with schizo- 
phrenia [66] further highlights a potential role for this 
metabolic pathway in this disorder. In addition, a role 
for this metabolic pathway in cognition and schizophre- 
nia is supported by the observation that inherited defi- 
ciency in the enzyme responsible for the breakdown of 
adenylosuccinate (ASL) results in mental retardation 
and autistic features [67,68]. Furthermore, the ASL gene 
maps to chromosome 22q13.1-q13.2 in humans [69] and
these chromosomal loci have been repeatedly linked to 
schizophrenia [70-72]. The disruption of this metabolic 
pathway may also contribute to the reduced rate of cer- 
ebral glucose metabolism in the PFC of PCP-treated ani- 
mals [3,4] as ASL deficiency results in hypometabolism 
in frontal cortical structures [73]. Overall, these results 
suggest that the potential role of this specific series of 
metabolic reactions and its enzymes in cognition and 
schizophrenia warrants further investigation. 

Conclusions 

This work addresses the scenario where a pair of net- 
works describes two different patterns of connection 
between a common set of nodes. We argued from first 
principles that the Generalized Singular Value Decom- 
position (equation (4)) can form the basis of a very 
useful computational tool. In practice, we have shown 
that this new computational network reordering tech- 
nique was able to identify alterations in metabolic 
pathways in the PFC of rats treated subchronically 
with PCP that may contribute to the PFC dysfunction 
and cognitive deficits seen in these animals. Further- 
more, the metabolic pathways identified as being dis- 
rupted in the PFC of PCP-treated rats through the
application of this new computational technique clearly
overlap with those metabolic species known to be dis- 
rupted in schizophrenia. Applying this new algorithm 
in this way also identified novel pathways that may 
also be relevant to schizophrenia. In this way we iden- 
tified alterations in glutamate metabolism and meta- 
bolic pathways central to glutamatergic 
neurotransmission, alterations in arginine and proline 
metabolism and the disruption of a novel series of pur- 
ine reactions that may contribute to the PFC dysfunc- 
tion and cognitive deficits seen in schizophrenia. 



Methods 

Chemicals 

The solvents used for the study were purchased from 
the following sources: Acetonitrile, methanol and 
chloroform (Fisher Scientific, Leicestershire, UK) and 
formic acid (VWR, Poole, UK). All chemicals used were 
of analytical reagent grade. A Direct Q-S"^ water purifi- 
cation system (Millipore, Watford, UK) was used to pro- 
duce HPLC grade water which was used in all analysis. 
Standards for 90 common bio-molecules were also pur- 
chased which were used to characterize the ZIC-HILIC 
column (Sigma Aldrich, Dorset UK). 

Animals 

All experiments were completed using male Lister 
Hooded rats (Harlan-Olac, UK) housed under standard 
conditions (21°C, 45-65% humidity, 12-h dark/light cycle 
(lights on 0600 h) with food and drinking water available
ad libitum). All manipulations were carried out at least 
1 week after entry into the facility and all experiments 
were carried out under the Animals (Scientific Proce- 
dures) Act 1986. Animals received either sub-chronic 
treatment with vehicle (0.9% saline, i.p., n = 5) or
2.58 mg/kg PCP.HCl (i.p., Sigma Aldrich, UK) once
daily for five consecutive days (n = 5). At 72 hours after
the final drug treatment dose animals were sacrificed 
and the brain rapidly dissected out and frozen in isopen- 
tane (-40°C) and stored at -80°C until sectioning. Frozen 
brains were sectioned (20 µm) in the coronal plane in a
cryostat (-20°C). Tissue sections from the prefrontal cortex
(PFC, Bregma +4.70 mm to Bregma +3.20 mm) were
collected in 4 ml glass vials with reference to a stereotactic
rat brain atlas [74] and stored at -80°C until further
preparation for LC-MS analysis. 

Extraction of polar metabolites from brain samples for 
LC-MS analysis 

Extraction of polar metabolites from brain tissue was 
carried out using the two-step extraction method 
described previously [75], using methanol, water and 
chloroform for the optimal extraction of polar metabo- 
lites. A hand held homogenizer was used to homogenize 
the samples once in solution. For preparation of samples 
for LC-MS analysis 200 µl of the collected polar extract
was added to 600 µl of 1:1 acetonitrile:water solution
to produce a final solvent:sample ratio of 3:1. The
samples were then filtered using Acrodisc 13 mm syringe
filters with 0.2 µm nylon membrane (Sigma Aldrich)
before LC-MS analysis. 

LC-MS analysis of polar metabolites 

Experiments were carried out using a Finnigan LTQ 
Orbitrap (Thermo Fisher, Hemel Hempstead, UK) using 
30000 resolution. Analysis was carried out in positive
mode over a mass range of 60-1000 m/z. The capillary 
temperature was set at 250°C and in positive ionization 
mode the ion spray voltage was 4.5 kV, the capillary
voltage 30 V and the tube lens voltage 105 V. The
sheath and auxiliary gas flow rates were 45 and 15, 
respectively (units not specified by manufacturer). A 
ZIC-HILIC column (5 µm, 150 x 4.6 mm; HiChrom,
Reading, UK) was used in all analysis and a binary gradi- 
ent method was developed which produced good polar 
metabolite separation. Solvent A was 0.1% v/v formic 
acid in HPLC grade water and solvent B was 0.1% v/v 
formic acid in acetonitrile. A flow rate of 0.3 ml/min
was used and the injection volume was 10 µl. The gradient
programme used was 80% B at 0 min. to 50% B at
12 min. to 20% B at 28 min. to 80% B at 37 min., with 
total run time of 45 minutes. The instrument was exter- 
nally calibrated before analysis and internally calibrated 
using lock masses at m/z 83.06037 and m/z 195.08625.
Samples were analysed sequentially and the vial tray 
temperature was set at a constant temperature of 4°C. 

Data preparation and analysis 

Determination of overt alterations in metabolite levels 

between experimental groups 

The software program Xcalibur (version 2.0) was used 
to acquire the LC-MS data. The raw Xcalibur data files 
from version 1.2 (Thermo Fisher, Hemel Hempstead, 
UK). SIEVE software (Thermo-Fisher Scientific) was 
used to identify all metabolites affected by drug treatment
by calculating a p-value and ratio based on the difference
in average intensities of individual peaks, which
correspond to different metabolites, between PCP-trea- 
ted and control animals. A significant difference in the 
level of each metabolite between groups was set at a p-value
< 0.05 and/or a ratio less than 0.5 for downregulated
metabolites and greater than 2 for upregulated
metabolites. The ratio is the fold change in average peak 
intensities from control and treatment groups. For 
metabolite identification the masses of the polar meta- 
bolites were compared to the exact masses of 6000 bio- 
molecules using an in-house developed macro (Excel, 
Microsoft 2007). 
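
The selection criteria above can be expressed as a small sketch. It is not the SIEVE implementation: the function names are hypothetical, the ratio is assumed to be the PCP/control ratio of mean peak intensities, and the "and/or" in the criteria is read here as a logical "or".

```python
from statistics import mean

def fold_change(treated, control):
    """Ratio of mean peak intensity, treated over control (assumed PCP/control)."""
    return mean(treated) / mean(control)

def is_flagged(p_value, ratio, alpha=0.05, down=0.5, up=2.0):
    """True if p < alpha, or the ratio is below `down` or above `up`."""
    return p_value < alpha or ratio < down or ratio > up
```

Under this reading, L-Tyrosine (p = 0.001, ratio 0.584) is flagged by its p-value alone, while L-Glutamine (p = 0.522, ratio 0.959) is not flagged.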
Hypergeometric probability testing 

The hypergeometric probability test was used to calcu- 
late the probability of finding at least the observed num- 
ber of metabolites of a given pre-defined metabolic 
pathway (as defined on the KEGG pathway database) in 
the clusters identified through the GSVD algorithm, 
with knowledge of the total number of metabolites pre- 
sent in that pathway detected by LC-MS in these sam- 
ples. The hypergeometric probability test was used to 
identify whether any of the KEGG defined metabolic 
pathways were significantly over-represented in any of 
the GSVD identified clusters. In its general form, hypergeometric 
probability allows the calculation of the probability 
of observing at least k metabolites from a given 
defined KEGG pathway in a defined cluster of n metabolites, 
given the total number of metabolites (N) and 
the total number of metabolites from the pathway in 
question (m). The probability mass function of the hypergeometric 
distribution is: 



$$f(k; N, m, n) = P(X = k) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}$$




So here the probability is calculated using the formula 

$$p = P(X \geq k) = \sum_{i=k}^{n} \frac{\binom{m}{i}\binom{N-m}{n-i}}{\binom{N}{n}}$$

Significant over-representation of a given functional 
group in any GSVD defined significant cluster was set 
by a hypergeometric probability threshold of 0.05. 
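
As a minimal sketch of the over-representation test described above, the upper-tail hypergeometric probability can be computed directly from binomial coefficients. The function names are hypothetical and the example counts are illustrative rather than taken from the study; only N = 98 is borrowed from the size of the correlation matrices listed under Additional material.

```python
from math import comb


def hypergeom_pmf(k, N, m, n):
    """P(X = k): exactly k pathway metabolites in a cluster of size n,
    given N metabolites in total, m of which belong to the pathway."""
    # Values of k outside the support have probability zero.
    if k < max(0, n - (N - m)) or k > min(m, n):
        return 0.0
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


def hypergeom_tail(k, N, m, n):
    """P(X >= k): probability of at least k pathway metabolites in the cluster."""
    return sum(hypergeom_pmf(i, N, m, n) for i in range(k, min(m, n) + 1))


# Hypothetical example: 5 pathway metabolites among 98 detected,
# 3 of which fall in a cluster of 10 metabolites.
p = hypergeom_tail(k=3, N=98, m=5, n=10)
over_represented = p < 0.05  # threshold stated in the text
```

Terms with i greater than m are zero, so summing up to min(m, n) gives the same value as summing up to n.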

Additional material 



Additional file 1: Table S1 - List of all metabolites detected by 
LC-MS in the PFC of control and PCP-treated animals. Table S1 Legend: 
The molecular formula and tentative molecular identity for each 
metabolite detected in the PFC of control and PCP-treated animals is 
shown. In addition, the KEGG molecular identity and the KEGG metabolic 
pathways in which a metabolite is involved are also shown. The ratio 
difference in metabolite concentration and the significance of this 
change (p-value), as determined by SIEVE analysis (see Methods section), 
are also shown. Those metabolites found to be significantly different 
between the two groups are highlighted in bold. The most prominent 
alterations in KEGG defined metabolic pathways appeared to be in (i) 
alanine, aspartate and glutamate metabolism (3 metabolites [ko00250]), 
(ii) phenylalanine, tyrosine and tryptophan metabolism (3 metabolites 
[ko00360]), (iii) purine metabolism (2 metabolites [ko00230]) and (iv) 
butanoate metabolism (2 metabolites [ko00650]). KEGG defined 
metabolic pathways: ko00250: Alanine, Aspartate and Glutamate 
metabolism; ko00627: Aminobenzoate degradation; ko00330: Arginine 
and Proline metabolism; ko00410: beta-Alanine metabolism; ko00780: 
Biotin metabolism; map00650: Butanoate metabolism; ko04973: 
Carbohydrate metabolism; ko00270: Cysteine and Methionine 
metabolism; ko00071: Fatty acid metabolism; ko00051: Fructose and 
Mannose metabolism; ko00052: Galactose metabolism; ko00471: 
Glutamine and Glutamate metabolism; ko00480: Glutathione metabolism; 
ko00561: Glycerolipid metabolism; ko00564: Glycerophospholipid 
metabolism; ko00260: Glycine, Serine and Threonine metabolism; 
ko00010: Glycolysis/Gluconeogenesis; ko00340: Histidine metabolism; 
ko00562: Inositol Phosphate metabolism; map00300: Lysine 
biosynthesis; ko00310: Lysine degradation; ko00430: Methionine 
metabolism; ko04080: Neuroactive ligand-receptor interaction; ko00760: 
Nicotinate and Nicotinamide metabolism; ko00190: Oxidative 
phosphorylation; ko00770: Pantothenate and CoA biosynthesis; ko00550: 
Peptidoglycan biosynthesis; ko00360: Phenylalanine metabolism; 
ko00400: Phenylalanine, Tyrosine and Tryptophan biosynthesis; 
ko00440: Phosphonate and Phosphinate metabolism; ko00640: 
Propanoate metabolism; ko00230: Purine metabolism; ko00240: 
Pyrimidine metabolism; ko00620: Pyruvate metabolism; ko00500: Starch 
and Sucrose metabolism; ko00600: Sphingolipid metabolism; ko00920: 
Sulphur metabolism; ko00430: Taurine and Hypotaurine metabolism; 
ko00730: Thiamine metabolism; ko00380: Tryptophan metabolism; 
ko00350: Tyrosine metabolism; ko00400: Tyrosine and Tryptophan 
biosynthesis; ko00290: Valine, Leucine and Isoleucine biosynthesis; 
ko00280: Valine, Leucine and Isoleucine degradation. NA denotes a 
metabolite not associated with a KEGG compound ID or KEGG pathway. 

Additional file 2: 98 x 98 matrix of between metabolite correlations 
in the PFC of control animals. The 98 x 98 matrix of the Pearson's 
correlation coefficients (Fisher z-transformed) between all metabolites 
detected in the prefrontal cortex of control (saline-treated) animals by 
LC-MS analysis is shown. 

Additional file 3: 98 x 98 matrix of between metabolite correlations 
in the PFC of PCP-treated animals. The 98 x 98 matrix of the Pearson's 
correlation coefficient (Fisher z-transformed) between all metabolites 
detected in the prefrontal cortex of PCP-treated animals by LC-MS 
analysis is shown. 

Additional file 4: Table S2 - Table showing the axis labels in 
Figures 9, 10 and 11. In Table S2 the position of each metabolite in the 
original ordering (Figure 4) is shown. In the columns for Figures 5 and 6, 
the corresponding numbers indicating the new position of each 
metabolite (node) in the matrix when reordered by the first column of X 
and the final column of X"'^ respectively, is shown. 